33 research outputs found

Embedded Large-Scale Handwritten Chinese Character Recognition

Author: Bellegarda Jerome R.
Chherawala Youssouf
Dixon Ryan S.
Dolfing Hans J. G. A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/04/2020

As handwriting input becomes more prevalent, the large symbol inventory required to support Chinese handwriting recognition poses unique challenges. This paper describes how the Apple deep learning recognition system can accurately handle up to 30,000 Chinese characters while running in real time across a range of mobile devices. To achieve acceptable accuracy, we paid particular attention to data collection conditions, representativeness of writing styles, and training regimen. We found that, with proper care, even larger inventories are within reach. Our experiments show that accuracy degrades only slowly as the inventory increases, as long as we use training data of sufficient quality and in sufficient quantity. Comment: 5 pages, 7 figures

arXiv.org e-Print Archive

Crossref

On the dynamic adaptation of language models based on dialogue information

Author: Bacchiani
Bellegarda
Darroch
F. Fernández-Martínez
Federico
J. Ferreiros
J.D. Echeverry
J.M. Lucas-Cuesta
Justo
Kuhn
Landauer
Lucas-Cuesta
López-Cózar
Manning
Martins
Riccardi
S. Lutfi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

We present an approach to dynamically adapt the language models (LMs) used by a speech recognizer that is part of a spoken dialogue system. We have developed a grammar generation strategy that automatically adapts the LMs using the semantic information that the user provides (represented as dialogue concepts), together with information about the intentions of the speaker (inferred by the dialogue manager and represented as dialogue goals). We carry out the adaptation as a linear interpolation between a background LM and one or more of the LMs associated with the dialogue elements (concepts or goals) addressed by the user. The interpolation weights between those models are automatically estimated on each dialogue turn, using measures such as the posterior probabilities of concepts and goals, which are estimated as part of the inference procedure that determines the actions to be carried out. We propose two approaches to handle the LMs related to concepts and goals. In the first, we estimate an LM for each of them; in the second, we apply several clustering strategies to group together those elements that share common properties, and estimate an LM for each cluster. Our evaluation shows that the system can estimate a dynamic model adapted to each dialogue turn, which improves speech recognition performance (up to a 14.82% relative improvement) and in turn improves both the language understanding and the dialogue management tasks.
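The linear interpolation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes unigram models stored as plain dictionaries, a fixed background weight, and element weights proportional to the posterior probabilities of the concepts or goals. The function name `interpolate_lms` and the parameter `background_weight` are hypothetical.

```python
from typing import Dict, List


def interpolate_lms(
    background: Dict[str, float],
    element_lms: List[Dict[str, float]],
    posteriors: List[float],
    background_weight: float = 0.5,
) -> Dict[str, float]:
    """Mix a background LM with dialogue-element LMs.

    P(w) = lambda_0 * P_bg(w) + sum_i lambda_i * P_i(w)

    Assumption (not from the source): lambda_0 is fixed and the remaining
    mass (1 - lambda_0) is shared in proportion to the posteriors.
    """
    total = sum(posteriors)
    if not element_lms or total <= 0:
        # Nothing to adapt with: fall back to the background model.
        return dict(background)

    weights = [(1.0 - background_weight) * p / total for p in posteriors]

    # The mixed model is defined over the union of all vocabularies.
    vocabulary = set(background)
    for lm in element_lms:
        vocabulary.update(lm)

    mixed = {}
    for word in vocabulary:
        prob = background_weight * background.get(word, 0.0)
        for lam, lm in zip(weights, element_lms):
            prob += lam * lm.get(word, 0.0)
        mixed[word] = prob
    return mixed
```

Because the weights sum to one and every component model is a probability distribution, the mixed model is also a distribution. Recomputing the weights from fresh posteriors on each turn gives a per-turn adapted model in the spirit of the abstract.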

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

DSCo-NG: A Practical Language Modeling Approach for Time Series Classification

Author: E Keogh
J Lin
J Serrà
JR Bellegarda
MG Baydogan
P Senin
PF Marteau
Q Wang
TC Fu
X Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2016

The abundance of time series data in various domains, together with their high dimensionality, makes it challenging to harvest useful information from them. To tackle storage and processing challenges, compression-based techniques have been proposed. Our previous work, Domain Series Corpus (DSCo), compresses time series into symbolic strings and uses language modeling techniques to extract knowledge about the different classes from the training set. However, this approach was flawed in practice due to its excessive memory usage and its need for a priori knowledge about the dataset. In this paper we propose DSCo-NG, which reduces DSCo's complexity and offers an efficient (linear time complexity and low memory footprint), accurate (performance comparable to approaches working on uncompressed data) and generic (applicable to various domains) approach to time series classification. Our confidence is backed by extensive experimental evaluation on publicly accessible datasets, which also offers insights into when DSCo-NG can be a better choice than other approaches.

Crossref

Open Repository and Bibliography - Luxembourg

Enhanced multiclass SVM with thresholding fusion for speech-based emotion classification

Author: B Roberto
CH Wu
CM Lee
D Bitouk
DA Sauter
H Peng
Ilker Demirkol
J Rong
JC Platt
Jianbo Yuan
JR Bellegarda
KR Scherer
M Hoque
M Liberman
Melissa Sturge-Apple
MP Black
N Yang
Na Yang
NHD Jong
NV Chawla
P Kerig
R Bakeman
R Barra-Chicote
S Yun
T Bänziger
VN Vapnik
Wendi Heinzelman
X Huang
Y Yang
Yun Zhou
Zhiyao Duan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

As an essential approach to understanding human interactions, emotion classification is a vital component of behavioral studies and is also important in the design of context-aware systems. Recent studies have shown that speech contains rich information about emotion, and numerous speech-based emotion classification methods have been proposed. However, classification performance still falls short of what is needed for the algorithms to be used in real systems. We present an emotion classification system that uses several one-against-all support vector machines with a thresholding fusion mechanism to combine the individual outputs, which makes it possible to increase emotion classification accuracy at the expense of rejecting some samples as unclassified. Results show that the proposed system outperforms three state-of-the-art methods and that the thresholding fusion mechanism can effectively improve emotion classification, which is important for applications that require very high accuracy but do not require that all samples be classified. We evaluate system performance in several challenging scenarios, including speaker-independent tests, tests on noisy speech signals, and tests using non-professional acted recordings, in order to demonstrate the performance of the system and the effectiveness of the thresholding fusion mechanism in real scenarios. Peer reviewed. Preprint.
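The idea of trading coverage for accuracy by rejecting low-confidence samples can be sketched as follows. This is an illustration only, not the fusion rule from the paper: it assumes each one-against-all classifier has already produced a confidence score for its emotion, and it accepts the top-scoring label only if that score reaches a threshold. The function names and the default threshold of 0.6 are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def fuse_with_threshold(scores: Dict[str, float], threshold: float = 0.6) -> Optional[str]:
    """Return the best-scoring label, or None to reject the sample.

    `scores` maps each emotion label to the confidence of its
    one-against-all classifier (assumed to be comparable across classifiers).
    """
    if not scores:
        return None
    best_label = max(scores, key=scores.get)
    return best_label if scores[best_label] >= threshold else None


def accuracy_and_coverage(
    samples: List[Tuple[Dict[str, float], str]], threshold: float
) -> Tuple[float, float]:
    """Accuracy on accepted samples and the fraction of samples accepted."""
    accepted = 0
    correct = 0
    for scores, true_label in samples:
        predicted = fuse_with_threshold(scores, threshold)
        if predicted is None:
            continue  # rejected as unclassified
        accepted += 1
        correct += predicted == true_label
    accuracy = correct / accepted if accepted else 0.0
    coverage = accepted / len(samples) if samples else 0.0
    return accuracy, coverage
```

Raising the threshold typically lowers coverage while raising accuracy on the accepted samples, which is the trade-off the abstract describes.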

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis

Author: A Ben-Hur
A Floratos
AR Shah
B Qian
B Rost
B-J Webb-Robertson
Bin Liu
C Leslie
CG Nevill-Manning
CS Leslie
H Ogul
H Rangwala
H Saigo
I Rigoutsos
J Bellegarda
J Shawe-Taylor
K Karplus
L Holm
L Liao
Lei Lin
M Ganapathiraju
M Gribskov
Q Dong
Qiwen Dong
QJ Su
QW Dong
R Kuang
S Henikoff
SE Brenner
SE Dowd
SF Altschul
T Damoulas
T Håndstad
T Jaakkola
T Lingner
TF Smith
TK Landauer
TL Bailey
VN Vapnik
WR Pearson
WS Noble
Xiaolong Wang
Xuan Wang
Y Hou
Y Yang
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Protein remote homology detection and fold recognition are central problems in bioinformatics. Currently, discriminative methods based on support vector machines (SVMs) are the most effective and accurate methods for solving these problems. A key step in improving the performance of SVM-based methods is finding a suitable representation of protein sequences. Results: In this paper, a novel building block of proteins called Top-n-grams is presented, which contains the evolutionary information extracted from the protein sequence frequency profiles. The frequency profiles are calculated from the multiple sequence alignments output by PSI-BLAST and converted into Top-n-grams. The protein sequences are transformed into fixed-dimension feature vectors by counting the occurrences of each Top-n-gram. The training vectors are used to train SVM classifiers, which are then used to classify the test protein sequences. We demonstrate that the prediction performance of remote homology detection and fold recognition can be improved by combining Top-n-grams with latent semantic analysis (LSA), an efficient feature extraction technique from natural language processing. When tested on superfamily and fold benchmarks, the method combining Top-n-grams and LSA gives significantly better results than related methods. Conclusion: The method based on Top-n-grams significantly outperforms methods based on many other building blocks, including N-grams, patterns, motifs and binary profiles. Top-n-grams are therefore a good building block of protein sequences and can be widely used in many computational biology tasks, such as sequence alignment, prediction of domain boundaries, design of knowledge-based potentials and prediction of protein binding sites.
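The conversion from a frequency profile to a fixed-dimension count vector can be sketched in Python. The abstract does not define Top-n-grams precisely, so this sketch assumes that the Top-n-gram of a profile column is the string of its n most frequent amino acids in descending order of frequency, with ties broken alphabetically. The profile format (one dictionary per sequence position) and all function names are hypothetical, and each column is assumed to contain at least n entries.

```python
from collections import Counter
from itertools import product
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def top_n_gram(column: Dict[str, float], n: int) -> str:
    """Join the n most frequent amino acids of one profile column.

    Assumption: sorted by descending frequency, ties broken alphabetically.
    """
    ranked = sorted(column.items(), key=lambda item: (-item[1], item[0]))
    return "".join(amino_acid for amino_acid, _ in ranked[:n])


def top_n_gram_features(profile: List[Dict[str, float]], n: int = 1) -> List[int]:
    """Count each possible Top-n-gram over all positions of the profile.

    The vector has 20**n entries, one per ordered string of n amino acids,
    so its dimension does not depend on the sequence length.
    """
    counts = Counter(top_n_gram(column, n) for column in profile)
    vocabulary = ["".join(letters) for letters in product(AMINO_ACIDS, repeat=n)]
    return [counts.get(gram, 0) for gram in vocabulary]
```

Vectors of this kind could then be fed to an SVM, optionally after a dimensionality reduction step such as LSA, as the abstract outlines.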

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Getting Past the Language Gap: Innovations in Machine Translation

Author: A Fraser
A Maletti
AB Phillips
C David
C España-Bonet
D Chiang
D Wu
F Jelinek
F Sanchez-Martinez
FJ Och
GS Matthew
H Somers
HM Caseli
Huang Liang Hao Zhang, Daniel Gildea, Kevin Knight
I Alegria
I Cicekli
J Bellegarda
J Hutchins
K Baker
K Owczarzak
L Levin
M Ashburner
M Bisani
N Collier
N Habash
N Ueffing
OF Josef
P Cimiano
P Vossen
S Ravi
T Green
Tong Xiao
V Pekar
W Mischo
Wei Wang
WJ Hutchins
Y Wilks
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

In this chapter, we review state-of-the-art machine translation systems and discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful, complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods were introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, advances in the related field of computational linguistics, which made off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge, though many advanced research groups are currently pursuing ways to meet it head-on. The next generation of MT will consist of a collection of hybrid systems. This also augurs well for the mobile environment, as we look forward to more advanced and improved technologies that enable Speech-To-Speech machine translation on hand-held devices, i.e. speech recognition and speech synthesis. We review all of these developments and, in the final section, point out some of the most promising research avenues for the future of MT.

Archivio Ricerca Ca'Foscari

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari